Protein Data Bank
Introduction

The Protein Data Bank (PDB) is a repository for known 3-D macromolecular structures. A large amount of data is available through this database. Along with the coordinates and structure factors, information from other databases (e.g. BLAST, UniProt, SCOP) is easily accessible. This information is shown directly on the site, and you can also follow a link straight to the source database. The archive now holds many different types of structures, including biological and synthetic sequences. Along with X-ray crystallography, there are nuclear magnetic resonance (NMR) and cryo-electron microscopy (cryoEM) data sets. The volume of data and the variety of experimental methods require the database to be dynamic: it has to adapt the way it processes and validates data to the method of collection. Many scientific fields benefit from the PDB, including but not limited to structural biology, drug design, bioinformatics, and proteomics. The PDB continues to grow exponentially.

Brief History

The idea of a collection of structural data first came about in the early 1970s. The field had decided that there was a need for an international repository to hold the data for sharing purposes. At that time data were collected on film and each atom had its own card; most structures had at least 1000 cards. This made sharing data very challenging and slow. In 1971 Walter Hamilton managed the first repository at Brookhaven National Laboratory. Tom Koetzle took over after Hamilton's passing. In 1989 general guidelines for depositing data were established by the field. The Research Collaboratory for Structural Bioinformatics (RCSB) began managing the database in 1999. In 2003 the Worldwide PDB (wwPDB) was formed, creating a union between the United States, Europe, and Japan (http://dx.doi.org/10.1016/j.febslet.2012.12.029; PMID: 18156675).

How to use the PDB

The homepage for the PDB is shown in figure 1.
From the homepage you can search for structures in multiple ways: PDB ID, molecule name, authors, etc. (fig. 2). Each structure is given a unique 4-character code: a digit followed by three alphanumeric characters (for example, 1TDH). You can also explore the database by experimental method, organism, SCOP classification, resolution, etc. (fig. 3). Another fun way to learn about proteins is the Molecule of the Month: each month there is a short feature about an interesting protein on the homepage (fig. 1).

Let's start exploring the database in more depth. I chose to search for NEIL1, as shown in figure 2. The search bar is for quick searches; there is also an advanced search link if you want to be very specific. The results for my search are shown in figure 4. The results are generally shown in chronological order. On the left is the PDB code. Alongside it is helpful information to aid in selecting the best structure, including authors, resolution, bound ligands, paper title (if published), and release date.

I selected PDB code 1TDH, which brings you to a page all about 1TDH (fig. 4). From this page you can download the model to look at yourself in PyMOL, Coot, or your favorite structure-viewing program. There is also a bar at the top of the page full of extremely helpful tabs. From this page you can find out almost everything about the molecule of interest. Under the summary tab you can find the abstract (fig. 5) as well as basic information about the protein and model (fig. 6). This is also where you can download the structure factors (fig. 6). Once you have exhausted the summary tab you can move on to the 3D-view tab (fig. 7), where you can look at the model using Jmol. This tab is not necessary if you plan to download the files to view on your own. The sequence tab is next, and there is a lot to learn here (fig. 8): the sequence is displayed along with the secondary structure determined in the model.
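For readers who would rather script the lookup than click through the site, the ID check, download, and a first look at the coordinates can be sketched in Python. This is a minimal sketch, not the PDB's own tooling: the function names are hypothetical, the ID pattern assumes the classic four-character form (a digit followed by three alphanumerics), the download address assumes the RCSB file server layout `https://files.rcsb.org/download/<ID>.pdb`, and the parser assumes the fixed-column legacy PDB text format.

```python
import re

# Assumption: classic PDB IDs are a digit followed by three alphanumerics.
PDB_ID_PATTERN = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def is_valid_pdb_id(pdb_id: str) -> bool:
    """Return True if the string looks like a four-character PDB ID."""
    return bool(PDB_ID_PATTERN.match(pdb_id))


def download_url(pdb_id: str) -> str:
    """Build the (assumed) RCSB download address for a legacy-format file."""
    if not is_valid_pdb_id(pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def parse_atoms(pdb_text: str) -> list:
    """Pull atom name, residue, chain, and x/y/z out of ATOM/HETATM records.

    The legacy PDB format is fixed-width, so fields are read by column
    position rather than by splitting on whitespace.
    """
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            atoms.append({
                "name": line[12:16].strip(),      # columns 13-16
                "res_name": line[17:20].strip(),  # columns 18-20
                "chain": line[21],                # column 22
                "res_seq": int(line[22:26]),      # columns 23-26
                "x": float(line[30:38]),          # columns 31-38
                "y": float(line[38:46]),          # columns 39-46
                "z": float(line[46:54]),          # columns 47-54
            })
    return atoms
```

To try it on the structure chosen above, the text returned by fetching `download_url("1TDH")` (for example with `urllib.request.urlopen`) can be decoded and passed to `parse_atoms`. The fixed-width layout echoes the one-card-per-atom records described in the history section, which is why the fields are sliced by column.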
The annotations tab links you to other databases that help classify proteins based on different similarities (fig. 9). Figure 10 shows the sequence similarities, which link you to BLAST. Lastly, you can find information about the structural keywords, protein (UniProt), and gene details (e.g. interesting SNPs), shown in figures 11, 12, and 13, respectively.

Use in Bioinformatics

Kulkarni-Kale U., Raskar-Renuse S., Natekar-Kalantre G., and Saxena S.A. 2014. Antigen-Antibody Interaction Database (AgAbDb): a compendium of antigen-antibody interactions. 1184:149-164. PMID: 2508123

Kumar A., Bhandari A., and Krishnaswamy S. 2014. Sequence and Structural Perspectives of Bacterial β-stranded Porins. Protein Pept Lett. PMID: 25159510

References

1. Berman H. et al. 2012. FEBS Letters. DOI: 10.1016/j.febslet.2012.12.029
2. Berman HM. 2008. Acta Crystallogr A. 64:88-95. PMID: 18156675